ABSTRACT


The  data  files  at  the  Thermophysical  Properties  Research  Center 
(TPRC)  are  stored  on  magnetic  tape  and  are  used  for  mechanized 
retrospective  searches  and  to  produce  the  original  copy  for  the  Center's 
"Retrieval  Guide  to  Thermophysical  Properties  Research  Literature" 
publication.  In  the  near  future,  the  Center  will  be  connected  directly 
to  Purdue's  computer  center  via  a  UHF  radio  circuit.  Partly  because 
of  the  relatively  frequent  changes  in  computer  equipment,  the  Center 
has  limited  its  use  of  mechanized  processes.  The  Director  believes 
that  at  present  his  staff  can  perform  most  searches  faster  by  manual 
reference  than  by  machine. 
I. SUMMARY


The data files of the Thermophysical Properties Research Center
(TPRC) have been entered on magnetic tape. These files are used for
mechanized  retrospective  searches  and  to  produce  the  original  copy  for 
the  Center's  "Retrieval  Guide  to  Thermophysical  Properties  Research 
Literature"  publication.  The  Retrieval  Guide  is  the  ordered  reproduction 
of  all  information  contained  in  TPRC's  files  in  book  form  and  is  available 
to  purchasers  who  desire  a  convenient  tool  for  manual  retrospective 
searches  of  literature  current  with  the  latest  volume  of  the  Guide.  The 
mechanized  search  process  is  primarily  used  by  the  Center  for  covering 
the  literature  in  the  data  base  that  is  current  since  the  latest  publication. 

The Thermophysical Properties Research Center was established
in 1957 under the sponsorship of both government and industrial
organizations to advance the knowledge of the thermal properties of matter.
It is a separate unit within Purdue University's schools of engineering.
(Appendix A illustrates the organizational structure of the Center.)
The Center consists primarily of an interdisciplinary staff of chemists,
physicists, chemical engineers, and mechanical engineers. This staff
operates in four major areas of activity related to thermophysical
properties:

1. Scientific Documentation

2. Critical Tables of Properties

3. Experimental Research

4. Theoretical Research

The Center's data base consists of information on the thermophysical
properties of about 42,900 substances. The properties are
divided into seven groups for a total of 13 properties, such as thermal
conductivity, specific heat, thermal diffusivity, etc., which represent
the data base at the present time.

At  present,  the  Center  is  searching  four  abstracting  journals: 
Technical  Abstract  Bulletin,  Scientific  and  Technical  Aerospace 
Reports,  Chemical  Abstracts,  and  Metallurgical  Abstracts.  It  also 
subscribes to about 98 scientific and technical journals, which are scanned
by Center personnel. Information is also obtained by reviewing
miscellaneous technical reports, dissertations, compendia, informal sources,
etc.

The Center provides information to its sponsors and their
contractors, and, on a selective basis, to members of the scientific
community.
II. MECHANIZATION


1.  CHRONOLOGY 

Consideration of mechanized techniques to aid the Center's operations
began in 1957, when the Center was founded. The logical flow
of the Center's developing documentation operations was designed to be
easily adaptable to computer processing.

In 1959, the first program for information storage and retrospective
searching was developed. Accession numbers were the only output, and
the corresponding bibliographic printout was accomplished by selecting and
printing the necessary EAM punched cards. This program was designed
for a Datatron computer.

During  1960  and  1961,  the  program  was  rewritten  for  a  Univac 
computer  which  replaced  the  Datatron. 

In  1963,  the  program  was  rewritten  for  the  IBM  7090  computer. 

In 1964, the program was again rewritten, this time for the IBM 7094
computer. In 1966, the program will again be rewritten for the IBM
360, which may replace the IBM 7094.
2. DESCRIPTION OF PROCESSES


Figure 1 illustrates the documentation system flow that is
summarized in the following paragraphs:


(1) Input Procedures

1. Abstracting journals are searched, and pertinent
references in selected sections are marked on an Abstract
Search Record Card in the columns representing the
properties and physical state being reported. One card per
abstracting journal is used.

2. Clerical assistants relocate the selected journal, cut
out the referenced abstracts, and insert these abstracts
into a 3 by 5 acetate folder. Abstracts that cannot be clipped
are photographically reproduced. When the abstracts are
clipped, they are labeled with the journal identification.

3.  The  actual  document  is  procured  in  hard  copy  and 
microfiche  form. 

4.  Coders  then  assign  code  numbers  to  the  abstract  to 
describe  a  total  of  14  items  of  technical  and  bibliographic 
nature.  These  codes  are  entered  in  the  Reference  Coding 
Form  as  shown  in  Figure  2. 
FIGURE 2

TPRC Reference Coding Form

The coded items are as follows:

(1) Properties - 2 digits for up to 99 properties

(2) Substance Class - 3 digits for up to 999 classes

(3) Substance Name - 4 digits for up to 9,999 substances in a class

(4) Physical State - 1 digit for up to 9 states

(5) Type of Subject Coverage - 1 digit to indicate nature of coverage,
such as theoretical, experimental, etc.

(6) Language of Original Article - 1 digit for up to 9 languages

(7) Temperature Range - 1 digit

(8) Serial Number of the Reference - 7 digits for up to 10 million
bibliographic references

(9) Journal Name - 5 digits for up to 100,000 journal codes

(10) Journal Volume - 3 digits

(11) Journal Number - 2 digits to indicate the serial number within
a volume

(12) Journal Series - 1 digit

(13) Beginning Page Number - 6 digits to indicate the starting page
of the article

(14) Journal Year - 3 digits which omit the thousands position in
the year group

Items (2) and (3) above represent the basis of information
organization in the Center. Item (2), which is the substance
class, has three digits which are assigned by series units
as follows:


Series 000            Work not involving substance class

                      Example: 011-Surveys; 031-Theory; 061-Patents

Series 100 and 200    Substance described by chemical formula

Series 300            Ferrous metal alloys (alloys where the amount
                      of iron exceeds 49 percent or is greater than
                      any other single constituent)

Series 400            Nonferrous metal alloys

Series 500 and 600    Substances that cannot be described correctly
                      in a single chemical formula and are not
                      metal alloys

                      Example: 551-structures of intermetallic and
                      ceramic compounds; 621-fabrics and yarns

One or more lines on the Reference Coding Form are
required to code each abstract depending upon whether one or
more properties, substances, or physical states are
discussed. A typical abstract requires four lines.


5. The information on one line of the Reference Coding
Form is then punched in sequence in the first 40 columns of
an EAM card. The punched cards are sorted on the first
nine columns by property, class within property, and
substance within class.

6. The sorted cards are used to update the master magnetic
tape file once each year, after which they are discarded.
Until they are transferred to tape, the accumulated cards are used
for searching. In the near future, it is planned to update the
tapes monthly.
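
The sort described in step 5 can be sketched in Python. This is only an illustration, not the Center's procedure (which used a card sorter): each card is assumed to be a 40-character string of digits, and the function name and sample cards are hypothetical.

```python
def sort_cards(cards):
    """Sort 40-column card images on columns 1 through 9.

    Columns 1-2 hold the property, 3-5 the substance class, and 6-9 the
    substance, so ordering on the first nine characters orders the cards
    by property, class within property, and substance within class.
    """
    for card in cards:
        if len(card) != 40 or not card.isdigit():
            raise ValueError("each card must be 40 digits")
    # Fixed-width digit strings compare the same way as the numbers they hold.
    return sorted(cards, key=lambda card: card[:9])


# Hypothetical cards: only the first ten columns are filled in here.
cards = [
    "0210100051" + "0" * 30,
    "0110200011" + "0" * 30,
    "0110100021" + "0" * 30,
]
ordered = sort_cards(cards)
```

Python's sort is stable, so cards with identical codes in columns 1 through 9 keep their original relative order.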


(2)  Output 


If a retrospective search query is handled by machine, the
query is refined and entered on an EAM punched card called a
query card. The query must specify the name of the material
and the property of interest. Additional information may include
the physical state of the material, the subject coverage, language
of the original article, temperature range, and year of publication.


Query cards are then run on the computer with the
appropriate property file tape. In response to the retrospective search
run, the computer punches an EAM card for each located item.
giving the item a serial number and bibliographic code of zero.

The  resulting  printout  is  all  of  the  information  on  the  magnetic  tape 
file  pertaining  to  a  specific  substance. 

At present, the draft of the Retrieval Guide (Books 2 and 3
of the three-book volume) is prepared by requesting an entire
printout of all of the property files. This is done in the form of
an unrestricted query. An example of a search output is given in
Appendix B.

3. ACTIVITIES BEING PLANNED OR DEVELOPED FOR MECHANIZATION

When  the  system  is  reprogrammed  for  the  IBM  360,  the  Center 
will  be  connected  directly  to  Purdue's  computer  center  via  a  UHF  radio 
circuit.  The  Center  has  already  acquired  a  teletype  and  paper  punch 
unit  and  is  conducting,  on  a  time-sharing  basis,  computational  and 
retrieval  experiments  with  the  National  Bureau  of  Standards  and  others. 
III. PROGRAM SYSTEM DATA

1.  MAJOR  FILES 

The TPRC data files are stored on magnetic tape, one tape per
property. On each tape, the data are in packed form and are ordered
by substance classification (digits 3-5) and substance number within the
classification (digits 6-9). Currently, the initial blocks of each tape
contain the directory for the tape. A block consists of 20 words of 10
digits (and a sign position) each. Each tape has a capacity of 20,000
blocks or 400,000 words. Each data entry consists of five words: four
words for the data and one for error control. There may be changes
made in this configuration in the near future.
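
The capacity figures above can be checked with a little arithmetic, sketched here in Python. The number of blocks taken up by the directory is not given in this report, so `directory_blocks` is a hypothetical parameter that defaults to zero.

```python
WORDS_PER_BLOCK = 20      # from the text: a block is 20 ten-digit words
BLOCKS_PER_TAPE = 20_000  # from the text
WORDS_PER_ENTRY = 5       # four data words plus one error-control word


def tape_capacity(directory_blocks=0):
    """Return (words, entries, entries_per_block) available for data.

    directory_blocks is an assumed value; the report does not state how
    many initial blocks the directory occupies.
    """
    usable_blocks = BLOCKS_PER_TAPE - directory_blocks
    words = usable_blocks * WORDS_PER_BLOCK
    entries_per_block = WORDS_PER_BLOCK // WORDS_PER_ENTRY
    return words, words // WORDS_PER_ENTRY, entries_per_block


words, entries, per_block = tape_capacity()
```

With no allowance for the directory, this gives 400,000 words, or 80,000 five-word entries per tape at four entries per block.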

The  entry  represents  14  items  of  coded  information  which  are  read 
into  the  computer  from  an  EAM  punched  card.  The  following  is  the 
format  of  the  data  entry: 


Data Word    Digit Positions         Information Coded

    1        1, 2                    Property
    1        3, 4, 5                 Classification
    1        6, 7, 8, 9              Substance
    1        10                      Physical State
    2        1                       Subject Coverage Type
    2        2                       Language
    2        3                       Temperature Range
    2        4, 5, 6, 7, 8, 9, 10    Serial No.
    3        1, 2, 3, 4, 5           Journal
    3        6, 7, 8                 Volume
    3        9, 10                   Number
    4        1                       Series
    4        2, 3, 4, 5, 6, 7        Starting Page of Article
    4        8, 9, 10                Year
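
The layout in the table above can be illustrated by a short Python sketch that packs the 14 coded items into four 10-digit words and unpacks them again. The field names and sample values are hypothetical, and the fifth (error control) word is not modeled because its contents are not described here.

```python
# (field name, number of digits), in the order given in the table above.
FIELDS = [
    ("property", 2), ("classification", 3), ("substance", 4),
    ("physical_state", 1),                                  # data word 1
    ("coverage", 1), ("language", 1), ("temperature_range", 1),
    ("serial_no", 7),                                       # data word 2
    ("journal", 5), ("volume", 3), ("number", 2),           # data word 3
    ("series", 1), ("start_page", 6), ("year", 3),          # data word 4
]


def pack(entry):
    """Pack a dict of integer codes into four 10-digit strings."""
    parts = []
    for name, width in FIELDS:
        text = str(entry[name]).zfill(width)
        if len(text) != width or not text.isdigit():
            raise ValueError(f"{name} does not fit in {width} digits")
        parts.append(text)
    digits = "".join(parts)
    return [digits[i:i + 10] for i in range(0, 40, 10)]


def unpack(words):
    """Recover the dict of integer codes from four 10-digit strings."""
    digits = "".join(words)
    entry, pos = {}, 0
    for name, width in FIELDS:
        entry[name] = int(digits[pos:pos + width])
        pos += width
    return entry


# Hypothetical entry; year 965 stands for 1965 (thousands digit omitted).
sample = {
    "property": 1, "classification": 102, "substance": 3,
    "physical_state": 1, "coverage": 2, "language": 1,
    "temperature_range": 3, "serial_no": 12345, "journal": 417,
    "volume": 12, "number": 3, "series": 0, "start_page": 1234, "year": 965,
}
words = pack(sample)
```

The field widths add up to 10 digits per word, so the 40 punched columns of a card map directly onto the four data words.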

2.  PROGRAMS 

(1)  File  Preparation 

Appendix C-1 illustrates the system flow for the initial file
preparation. The data input is on EAM punched cards prepared
as described in Section II and sorted on the first nine columns.
Cards with identical digits in columns 1 through 5 are read into
the main memory as a group. Then, one entry at a time, they are
written onto the tape file along with a new directory item at the head
of the tape. Error checking is accomplished after each entry is
written onto tape.

When the last card is read from the subset with identical
digits in columns 3 through 5 (all the same substance classification
code), the next group of cards is read into the main memory and
the process iterated. When the last card is read from the set with
identical digits in columns 1 and 2 (same property code), the
program is stopped and the tape changed. The iterative process
is then continued until all of the data cards have been written onto the
respective tapes.

(2)  File  Maintenance 

The program system flow for the file maintenance run is
shown in Appendix C-2. Punched cards containing new additions
for the data base are sorted on columns 1 through 9 and then
separated on the basis of identical property codes in columns 1 and 2.
Cards for one property are then read into the main memory.

The first group of entries having identical codes in positions
1 through 9 (same property, substance classification, and
substance code) is compared to the old property tape's directory to
determine the proper tape address block for storage. Data from
the old property tape are transferred to the update tape until the
last old item in the determined storage address block is detected.
The old tape servo is then stopped and the first new item for storage
is compared on data words 3 and 4 to all the items in the last
address block just entered on the update tape. This comparison
is a check for preexisting duplicate entries. (A duplicate can exist
only if the first data word of two entries is identical. Since this
word is already determined by the storage address block, it is
only necessary to compare on data words 3 and 4.) If a new entry
is not a duplicate of an existing one, it is stored on the update tape
at the next available position in the address block. The update tape
is then reversed to its head and the tape directory block modified
to incorporate the new entry. Duplicate items are rejected after
recording on a punched card. This process is iterated until the
last item of the first group with common data word 1 is detected.
The next group with the common feature is then moved up for
storage.

When the last item of the last group has been processed, the
run is ended for the selected property. It is then necessary to
stack the next set of common property cards, to replace the update
tape with a clean tape (or the old property tape just updated), and to
mount the corresponding old property tape next to be updated.

Note  that  data  on  the  tapes  are  maintained  in  a  packed  arrangement 
with  no  gaps  for  updating  as  a  result  of  the  technique  of  interfiling 
on  a  new  tape  from  both  the  old  tape  and  the  main  memory. 
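
The interfiling and duplicate check described in this subsection can be sketched in Python. This is a simplified illustration under stated assumptions, not the Center's program: entries are represented as tuples of four 10-digit strings, the storage address block is identified with data word 1, the directory and the error-control word are omitted, and the function name is hypothetical.

```python
def update_file(old_entries, new_entries):
    """Interfile new entries into an old property file.

    old_entries must already be ordered by data word 1. Returns the
    updated (still packed and ordered) list and the rejected duplicates.
    """
    updated, rejected = [], []
    i = 0
    for new in sorted(new_entries, key=lambda e: e[0]):
        key = new[0]
        # Transfer old items through the end of the address block for key.
        while i < len(old_entries) and old_entries[i][0] <= key:
            updated.append(old_entries[i])
            i += 1
        # Compare on data words 3 and 4 against the block just written.
        duplicate = False
        j = len(updated) - 1
        while j >= 0 and updated[j][0] == key:
            if updated[j][2:4] == new[2:4]:
                duplicate = True
                break
            j -= 1
        if duplicate:
            rejected.append(new)   # the source records these on a card
        else:
            updated.append(new)    # next available position in the block
    updated.extend(old_entries[i:])  # copy the remainder of the old tape
    return updated, rejected


# Hypothetical data: the first new entry repeats words 3 and 4 of an old one.
old = [
    ("0110200031", "2130012345", "0041701203", "0001234965"),
    ("0110200051", "1110000007", "0002000101", "0000010960"),
]
new = [
    ("0110200031", "1110099999", "0041701203", "0001234965"),
    ("0110200031", "2130012346", "0050000502", "0000077964"),
    ("0110200041", "2130012347", "0050000502", "0000077964"),
]
updated, rejected = update_file(old, new)
```

Because every entry is appended to a fresh output list, the result has no gaps, which mirrors the packed arrangement obtained by interfiling onto a new tape.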
(3) Information Retrieval

Search cards in the same format as the data cards are
prepared, specifying the parameters for which bibliographic data are
desired. These are read into the computer as illustrated in
Appendix C-3 and compared to items referring to the same substance
(same data word 1) from the appropriate property file tape. When
a match is made, the item data from the tape is punched out on an
EAM answer card. The process is iterated until all query cards
are processed for one property. Additional properties may be run
after correspondingly changing the property file tape. When all of
the queries are processed, the answer card stack is run in a tabular
form which lists the bibliographic data in the appropriate line
format.
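
The matching step can be sketched in Python as follows. The report does not say how an unspecified parameter is marked on a query card, so this sketch assumes that a zero (or absent) field means "not specified"; that convention, the field names, and the sample items are all hypothetical.

```python
REQUIRED = ("property", "classification", "substance")
OPTIONAL = ("physical_state", "coverage", "language",
            "temperature_range", "year")


def matches(query, item):
    """True if a file item satisfies a query.

    The material and property must agree exactly; an optional field
    restricts the search only when it is nonzero (an assumed convention).
    """
    if any(query[name] != item[name] for name in REQUIRED):
        return False
    return all(query.get(name, 0) in (0, item[name]) for name in OPTIONAL)


def search(queries, items):
    """Return one answer per matching item, processed query by query."""
    answers = []
    for query in queries:
        answers.extend(item for item in items if matches(query, item))
    return answers


# Hypothetical file items for one property tape.
items = [
    {"property": 1, "classification": 102, "substance": 3,
     "physical_state": 1, "coverage": 2, "language": 1,
     "temperature_range": 3, "year": 965, "serial_no": 12345},
    {"property": 1, "classification": 102, "substance": 3,
     "physical_state": 2, "coverage": 1, "language": 4,
     "temperature_range": 1, "year": 958, "serial_no": 777},
]
query = {"property": 1, "classification": 102, "substance": 3, "language": 4}
answers = search([query], items)
```

In this sketch a query that gives only the required fields returns every item on file for that substance and property.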
IV. EQUIPMENT, COSTS, AND EVALUATIONS


1.  EQUIPMENT 


The  program  as  described  above  was  designed  for  the  computational 
equipment  listed  as  follows: 


Datatron (Electrodata) Electronic Computer with 4,000
ten-digit words of magnetic drum storage, 2 magnetic tape
drives, 400,000 words (20,000 blocks) of storage per tape.

IBM  026  card  punch 
IBM  056  verifier 
IBM  083  sorter 
IBM  514  reproducer 
IBM  407  tabulator 


The Datatron was replaced by a UNIVAC, then an IBM 7090, and later,
by an IBM 7094. Plans call for the 7094 to be replaced by an IBM 360.

2.  COSTS 


TPRC Staff Programmer                    half-time

Computer rental per hour                 $80

Total annual computer center cost        $10,000

Average computer time to print           5 or 6 hours at
data section of the Retrieval Guide      $40 per hour

The computer time required to complete a bibliographic search is
less than a minute for the search itself and 40 minutes to print out a
search of 50 pages, at an approximate print-out speed of 800 wpm.

3.  FACILITY'S  EVALUATION  OF  SYSTEM 

Because of the relatively frequent changes in computer
equipment, the Center has limited its program development. In addition,
most retrieval queries can be answered by manual reference to the
Retrieval Guide or by consulting with one of the staff members.

Cost  data  resulting  from  computer  usage  is  not  very  meaningful 
because  of  the  relatively  greater  amount  of  effort  involved  in  manual 
preparation  of  a  query  than  is  spent  in  the  mechanized  process. 

The Center has found it very useful to provide its own programmer.
This arrangement permits the programmer to remain conversant with
the TPRC technology.

The Director believes that, at present, he can perform a search
faster manually than by machine. This is with reference to the total
"Real Time," which includes all the time consumed between the asking
of the question and the receipt of the information by the requester.